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ABSTRACT 

Range-space  methods  for  convex  quadratic  programming  improve  in  efficiency  as  the  number 
of  constraints  active  at  the  solution  decreases.  This  paper  describes  in  detail  three  alternative 
range-space  methods,  each  based  upon  updating  different  matrix  factorisations.  It  is  shown  that 
the  choice  of  method  depends  upon  the  relative  importance  of  storage  needs,  computational 
efficiency  and  numerical  reliability. 

The  updating  methods  described  are  applicable  to  both  primal  and  dual  quadratic  program¬ 
ming  algorithms  that  use  an  active- set  strategy. 
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1.  Introduction 


1.  Introduction 

This  paper  ia  concerned  with  methods  for  solving  tin  convex  quadratic  programming  problem  — 
the  minimisation  of  a  positive-definite  quadratic  form  subject  to  a  set  of  linear  constraints  on  the 
variables.  We  shall  assume  the  problem  to  be  stated  in  the  following  form: 


QP  minimise  eTx  -f  \zTHz 

*e»*  2 

subject  to  t  <  At  <  u, 


where  H  is  a  symmetric  positive-definite  matrix,  and  A  is  m  X  ».  (The  methods  to  be  described 
may  be  applied  directly  to  equality  constraints,  which  are  omitted  for  simplicity.  The  adaptation 
of  one  of  the  methods  to  exploit  the  structure  of  simple  bound  constraints  is  described  in  Gill  et 
a/.,  1982.)  The  vector  function  g{x)  =  e+ Hz  will  denote  the  gradient  of  the  quadratic  objective 
function. 

Apart  from  the  requirement  of  feasibility,  the  optimality  conditions  for  problem  QP  involve 
only  the  constraints  active  (exactly  satisfied)  at  the  solution.  Let  x*  denote  the  (necessarily 
unique)  solution  of  QP.  Let  A*  denote  the  matrix  whose  rows  are  the  active  constraints,  and  6* 
the  vector  of  corresponding  components  of  l  or  u,  so  that  A***  =  b*,  with  strict  inequality  in  all 
other  constraints.  Let  L  denote  the  set  of  rows  in  A*  corresponding  to  active  lower  bounds,  and 
U  the  set  of  rows  in  A*  corresponding  to  active  upper  bounds.  There  must  exist  a  vector  X*  such 
that 

A*V  a  c  +  Hz*  =  y(**),  (1.1) 

where 

>  o,  *  e  L;  x*  <  o,  ieu. 

Active-set  methods  are  based  on  developing  a  prediction  of  the  active  set  (the  working  set), 
which  includes  the  constraints  exactly  satisfied  at  the  current  point  (see,  e.g.,  Gill  and  Murray, 
1977).  Let  x*  denote  the  current  iterate,  and  its  gradient;  the  t*  rows  of  the  matrix  A*  are 
defined  as  the  coefficients  of  the  constraints  in  the  working  set,  and  the  vector  ft*  is  composed  of 
the  corresponding  components  of  i  and  u  (so  that  A*x*  =  6*).  The  search  direction  p*  is  chosen 
so  that  x*  +  Pk  is  the  solution  of  a  quadratic  programming  problem  with  the  original  objective 
function,  subject  to  the  equality  constraints  of  the  working  set.  With  this  definition,  Pk  is  itself 
the  solution  of  the  quadratic  program 


minimise  gTp  +  zPTHp  „ 

2  (1.2) 

subject  to  AkP  =  0. 

The  optimality  conditions  for  (1.2)  imply  that  there  exists  a  vector  of  Lagrange  multipliers 
(denoted  by  X*)  such  that 

=  ».+»!>.•  0-») 

If  ta  is  feasible  and  A*  =  A*  it  follows  from  (1.1)  and  (1.8)  that  +  pk  =  i  and  X*  =  X*. 
Otherwise,  the  working  set  is  changed  by  adding  or  deleting  constraints  until  the  correct  active 
set  is  determined. 


Range-Space  Methods  for  Can  rex  Quadratic  Programming 


It  is  useful  to  classify  active-set  QP  methods  as  either  range-space  or  null-space  methods. 
This  terminology  arises  because  the  working  set  can  be  viewed  as  defining  two  complementary 
subspaces:  the  range  space  of  vectors  that  can  be  expressed  as  linear  combinations  of  the  rows  of 
Ak,  and  the  null  space  of  vectors  orthogonal  to  the  columns  of  A*.  Each  iteration  of  a  quadratic 
programming  method  involves  computing  p*  (and  sometimes  X*),  and  in  many  cases  the  work 
required  in  an  iteration  is  directly  proportional  to  the  dimension  of  either  the  range  space  or 
the  null  space.  For  example,  the  methods  of  Murray  (1971),  Gill  and  Murray  (1978),  Bunch 
and  Kaufman  (1980)  and  Powell  (1981)  are  null-space  methods,  and  are  most  efficient  when  the 
number  of  constraints  in  the  working  set  is  close  to  n,  since  the  dimension  of  the  null  space  is 
then  relatively  small.  (Some  methods  cannot  be  categorised  as  either  range-space  methods  or 
null-space  methods.  See,  for  example,  the  methods  proposed  by  Bartels,  Golub  and  Saunders, 
1970,  Fletcher,  1971,  and  Goldfarb  and  Idnani,  1981.) 

This  paper  is  concerned  with  range-space  methods,  which  increase  in  efficiency  as  the  number 
of  constraints  in  the  working  set  decreases.  In  Sections  2,  3  and  4  we  describe  three  alternative 
range-space  methods.  The  differences  in  the  methods  arise  because  of  the  different  emphasis  placed 
on  the  relative  importance  of  storage  needs,  computational  efficiency  and  numerical  reliability. 
In  all  cases  we  shall  discuss  primarily  the  details  of  how  to  compute  p*  and  X*,  and  not  the 
various  strategies  for  altering  the  working  set.  The  techniques  described  may  be  applied  in  the 
implementation  of  both  primal  and  dual  quadratic  programming  algorithms  that  use  an  active-set 
strategy. 

As  general  background  for  range-space  methods,  we  observe  that  the  constraints  and  op¬ 
timality  conditions  for  problem  (1.3)  imply  that  p*  and  X*  satisfy  the  n  -f- tk  linear  equations 

(rfxnH?). 

which  may  be  solved  in  several  different  ways.  The  range-space  approach  takes  advantage  of 
the  case  where  H  is  non-singular.  If  A*  has  full  rank,  pk  and  X*  are  defined  by  the  following 
range-space  equations: 


AkH-lA.l\k  =  AhH~',k; 

and 

Hpk  —  Ak\k  —  gk, 

which  underlie  all  the  methods  of  this  paper. 


(1.4) 

(1.5) 


2.  Method  It  •  low-storage  method 

Probably  the  most  straightforward  range- space  method  is  obtained  by  utilising  the  Cholesky 
factors  of  H  and  AkH~xAl  to  solve  the  range-space  equations  (1.4)  and  (1.5)  directly.  This 
method  has  been  suggested  with  respect  to  general  linearly  constrained  problems  by  Gill  and 
Murray  (1974,  pp.  77-78)  and  Goldfarb  (1976),  and  with  respect  to  quadratic  programming  by 
Dax  (1981). 

We  shall  denote  the  required  factorisations  as 

H  =  RtR  and  AhH~x Aj  =  U\U k,  (2.1) 

where  R  and  Uk  are  upper-triangular  matrices  of  dimension  n  X  n  and  tk  X  tk  respectively. 


2 .  A  low-storage  method 


i 


Note  that  R  needs  to  be  computed  only  once,  and  may  be  overwritten  on  H.  In  this  section, 
all  computations  involving  H  are  actually  performed  using  R,  but  are  represented  in  terms  of 
H  for  simplicity.  The  matrix  AkH~xAl  changes  in  a  special  way  as  constraints  enter  and  leave 
the  working  set.  The  associated  update  procedures  for  t/*  will  be  summarised  in  this  section; 
complete  details  are  given  in  the  Appendix  (see  also  Du,  1981). 

The  fundamental  relationship  connecting  the  updated  Cholesky  factors  is  the  following. 
Consider  the  positive-definite  symmetric  matrix  M  with  Cholesky  factors  UTU,  and  the  related 
positive-definite  matrix  M  defined  by 


where  m  is  a  column  vector  and  p  a  scalar.  The  Cholesky  factor  0  of  St  is  given  by 

0  =  ^  ^  where  UTu  =  m  and  p3  =  p  —  uru  (2.3) 

(see  Wilkinson,  1965,  pp.  229-230  and  Section  A.2.1  of  the  Appendix).  (If  M  is  positive  definite, 
p  —  uTu  >  0.) 

For  the  remainder  of  this  section  we  shall  drop  the  subscript  k,  so  that,  for  example,  A 
denotes  the  matrix  of  constraints  in  the  working  set.  Barred  quantities  will  be  associated  with 
a  single  change  in  the  working  set.  We  shall  show  that  the  vector  X  can  be  updated  after  every 
change  in  the  working  set  with  very  little  work  beyond  that  required  to  update  the  Cholesky 
factor  U.  To  demonstrate  this,  we  require  the  following  lemma,  which  we  state  without  proof 
(see,  for  example,  Cottle,  1974). 


Lemma  1.  Let  M  (M  —  UTU)  be  a  positive-definite  symmetric  matrix  and  b  a  rector,  and  let 
x  denote  the  solution  of  Mx  =  b.  Consider  the  augmented  matrix  M  (M  =  VW)  defined  by 
(2.2)  and  (2.3),  and  the  rector  l  defined  by 


l 


(2.4) 


where  P  is  a  scalar.  If  M  is  non-singular,  then  the  solution  of  the  system  Mt  =  J  is  given  by 


x  —  (v 


) 


where 

UTu  =  m,  Uv  =  u,  p2  —  p  —  uru  and  (  =  — — .  (2.5) 

r 

I 

Adding  a  constraint  to  the  working  set.  When  a  constraint  is  added  to  the  working  set,  a  new 
row  (say,  ar)  becomes  the  last  row  of  A.  The  matrix  AH~lAT  then  has  the  form  (2.2),  with 


M  =  AH~iAT,  m  —  AH~la,  and  p  =  aTH~l a,  (2.6) 


so  that  its  Cholesky  factor  0  is  given  by  (2.3). 


Range-Space  Methods  for  Convex  Quadratic  Programming 


I 


Given  the  updated  Cholesky  factor  0  following  a  constraint  addition,  it  is  easy  to  obtain  the 
updated  vector  of  Lagrange  multipliers,  as  shown  in  the  following  theorem. 

Theorem  1.  Let  X  and  p  denote  the  solutions  to  the  range-space  equations  (1.4)  and  (i.5)  at 
the  point  x.  Let  £  (£  =  *  +  ap)  be  a  point  at  which  the  row  aT  is  added  to  A.  Assume  that 
the  updated  Cholesky  factor  V  of  AH~lAT  has  been  computed ,  so  that  u  and  p  of  (2.3)  are 
available.  Then  the  Lagrange  multiplier  vector  that  solves  (1.4)  at  £  is  given  by  the  vector  X, 
whose  elements  are  defined  by 

Xy  =  Xj  -  {Vj,  j  =  1,2, and  \t+l  =  £,  (2.7) 

where  v  and  £  are  defined  by 


Uv  =  u 


and 


(a  —  1  )oTp 
P2 


Proof.  By  construction  of  p  and  definition  of  g, 

AH~lg  =  AH~\g  +  aHp)  =  AH~'g  +  aAp  =  AH~lg.  (2.8) 

Hence,  Lemma  1  can  be  applied  to  the  equations  tor  X  and  X,  and  the  form  (2.7)  follows  directly. 
Since  P  in  (2.4)  is  given  by  aTH~lg,  it  follows  from  (2.5)  that  the  scalar  £  is  given  by 

(r 

Substituting  the  expression  g  +  qHp  for  g  and  the  definition  of  m  from  (2.6),  the  numerator  of 
(2.9)  becomes 

aTH~lg  —  mT\  —  a  TH~l(g  +  aHp )  —  a  TH~lAT\ 

=  aTH~l{aHp  -  AT\  +  y). 

Using  (1.5),  this  expression  further  simplifies  to  (a  —  l)arp.  Hence,  (==  (a  —  l)orp/p2.  | 

Theorem  1  shows  that  £ta  multiplications  are  required  to  update  X  when  a  constraint  is  added 
to  the  working  set. 

Deleting  a  constraint  from  the  working  set.  First,  suppose  that  A  is  given  by  A  with  its  last  row 
(denoted  by  oT)  deleted.  In  this  case,  the  relationship  between  AH~lAT  and  AH~lAT  is  the 
reverse  of  (2.2),  and  hence  the  new  Cholesky  factor  is  simply  a  submatrix  of  the  old,  i.e. 

*-(:;>  <2io) 

The  following  theorem  shows  that  the  updated  multiplier  vector  can  be  computed  with  little  effort 
in  this  case. 


2.  A  Jaw-storage  method 


s 


Theorem  3.  Let  X  and  p  denote  the  solutions  to  the  range-space  equations  at  the  point  x.  Suppose 
that  the  constraint  corresponding  to  the  last  row  of  A  is  deleted  from  the  working  set  at  the  point 
t  =  x  -j-  p.  If  A  denotes  the  matrix  of  constraints  in  the  working  set  after  the  row  has  been 
deleted,  0  denotes  the  Cholesky  factor  of  AH  ~lAT,  and  0  denotes  the  rector  of  the  first  t  —  1 
elements  in  the  last  column  of  U,  the  rector  of  multipliers  \  computed  at  Z  with  the  new  working 
set  is  giren  by 

—  X/  "4"  Xfffy,  j  1,2, ... ,t  1, 

where  ff  satisfies  Uv  =  0. 

Proof.  Our  assumption  about  the  position  of  the  deleted  row  means  that 


ath-u = ( 

V  aTH~lAT  aTH~la  ) 


(2.11) 


Since  X  solves  (1.4)  and  (2.8)  holds,  we  may  apply  Lemma  1  with  A H~lAT  as  A 4  and  AH~lAT 
as  Sf.  This  gives 

>-(sr> 

where 

AH~lAT0  =  AH~la. 

Because  UTU  =  AH~lAT,  (2.10)  and  (2.11)  imply  that  VTg  =  AH~la-,  since  VTV  -  AH~lAT, 
it  follows  that  V  may  be  obtained  by  solving  the  triangular  system  Ov  —  0.  | 

Theorem  2  implies  that  the  vector  \  may  be  obtained  in  £t*  multiplications. 

In  general,  a  row  may  be  deleted  from  aqy  position  within  A,  so  that  Theorem  2  cannot 
be  applied  directly.  In  this  case,  the  rows  of  A  are  permuted  to  move  the  deleted  row  to  the 
last  position  and  a  sequence  of  plane  rotations  is  applied  to  restore  the  permuted  form  of  If  to 
upper-triangular  form  (see  Section  A.l  of  the  Appendix).  In  this  case,  the  relationship  between 
U  and  0  is  of  the  form 


where  Q  is  a  sequence  of  plane  rotations  and  II  is  a  permutation  matrix.  The  number  of 
multiplications  required  to  perform  these  rotations  when  the  j-th  constraint  is  deleted  is 
(we  assume  the  three-multiplication  form  of  the  plane  rotations;  see  Gill  et  a/.,  1974).  Thus,  the 
essential  features  of  Theorem  2  remain  unchanged,  the  only  difference  being  the  need  to  reorder 
the  elements  of  X  before  updating. 


Discussion  of  the  low* storage  method.  Two  advantages  of  this  method  are  the  relative  simplicity 
of  the  associated  updates  and  the  low  storage  requirements.  Since  R  may  be  overwritten  on  the 
space  required  to  store  H,  the  storage  space  required  for  this  method  (beyond  that  needed  to 
represent  the  problem)  is  of  the  order  of  jt*  M  locations,  where  (BU  is  the  maximum  number  of 
constraints  in  the  working  set.  In  the  case  where  there  are  always  few  constraints  in  the  working 
set,  the  method  requires  very  little  extra  storage.  The  disadvantage  of  the  method  lies  in  the 
possible  magnification  of  any  ill-conditioning  in  A  and  H  during  the  implicit  formation  of  the 
product  AH~lAT. 

The  following  table  summarises  the  number  of  multiplications  needed  to  perform  each  com¬ 
putation  in  the  low-storage  method. 


R*age-Sp*ce  Method*  for  Con  rex  Quadratic  Programming 


Update  X 

Compute  p 

Add  constraint 

Delete  *-th  constraint 

»*+nt 

»*+»t+|ta 

*(‘-0* 

8.  Method  2:  a  batter  conditioned  method 

The  range-space  equations  (1.4)  and  (1.5)  are  not  the  best  representation  of  pk  and  Xk  for  purposes 
of  computation,  since  formation  of  the  matrix  AkH-1Ak  may  exacerbate  the  conditioning  of  the 
original  problem.  This  difficulty  may  be  lessened  by  rewriting  the  equations  in  terms  of  the  LY 
factorisation  of  Ak,  i.e., 

Ak  =  LhYl,  (3.1) 

where  the  t  columns  of  Yk  are  orthonormal  n-vectors,  and  L*  is  a  t  X  t  lower-triangular  matrix. 
(The  columns  of  %  form  an  orthonormal  basis  for  the  range  space  of  the  rows  of  Ak.) 
Substituting  (3.1)  into  (1.4)  and  (1.5),  we  obtain 

Lk  Xk  =  9k  (3.2) 

HPk  =  Ykh  —  9k,  (3-3) 

where  9k  satisfies 

rfH-Mv.-rjH-*#,.  (3.4) 

Equations  (3.2)  and  (3.3)  are  theoretically  equivalent  to  (1.4)  and  (1.5);  however,  they  are  likely 
to  have  superior  numerical  properties,  since  the  orthogonality  of  the  columns  of  Yk  means  that 
the  condition  number  of  YlH~lYk  is  no  greater  than  that  of  H.  The  use  of  the  factorisation 
(3.1)  to  derive  a  better  conditioned  projection  matrix  was  suggested  by  Gill  and  Murray  (1974, 
pp.  73-79)  in  the  context  of  general  linearly  constrained  problems;  see  also  Goldfarb  (1976). 

Computation  of  Xk  and  pk  from  (3.2)  and  (3.3)  may  be  implemented  by  storing  the  Cholesky 
factors  of  H  (which  need  be  computed  only  once),  and  updating  the  LY  factorisation  of  Ak  and 
the  Cholesky  factors  of  YkH~vYk  as  the  working  set  changes.  A  detailed  description  of  the 
update  procedures  is  given  in  Section  A.2  of  the  Appendix.  For  simplicity,  we  shall  drop  the 
subscript  k  in  the  rest  of  this  section;  barred  quantities  will  denote  those  resulting  from  a  single 
change  in  the  working  set.  Let  U  denote  the  Cholesky  factor  of  YTH~lY,  i.e., 

YTH~lY  =  UTU. 

Adding  a  constraint  to  the  working  set.  When  a  constraint  is  added  to  the  working  set,  a  new 
row  (say,  or)  is  added  to  A.  The  effect  of  this  change  on  the  LY  factorisation  is  that  a  new 
column  is  added  to  Y,  so  that  T  is  given  by 

?  =  {Y  z). 

In  this  case,  the  matrix  YT H~lY  is  of  the  augmented  form  M  (2.2),  with 

M  =  YtH~1Y,  m  =  YtH~1z  and  (i  =  zrH~1z. 

Thus,  exactly  the  same  update  procedures  described  in  Theorem  2  can  be  applied  to  compute  the 
updated  Cholesky  factorisation  of  YTH~lY . 

We  now  show  that  the  vector  9  of  (3.4)  can  be  updated  rather  than  recomputed,  using  a 
technique  similar  to  that  given  in  Section  2.1  for  updating  X. 


&  A  better  conditioned  method 


T 


Theorem  3.  Let  p  end  9  denote  the  solutions  to  the  equations  (3.3)  and  (3.4)  at  the  point  x.  Let 
t  (*  =  *-{-  ap)  be  a  point  at  which  the  row  aT  is  added  to  A.  Assume  that  the  LY  factorisation 
of  A  and  the  Cholesky  factor  V  of  YTH~XY  hare  been  computed.  Let  u  and  p  denote  the 
corresponding  portions  of  V,  as  in  (2.3),  and  let  x  denote  the  last  column  ofY.  T hen  the  vector 
9  that  solves  (3.4)  at  I  is  given  by 

=  9j  ~  £vj,  j  =1, 2,..., t,  and  h+i  = 


where  v  and  (  are  defined  by 


Uv  =  u  and  (  = 


(a  —  1  )xTp 


Proof.  Since  YTp  —  0,  the  proof  is  essentially  the  same  as  for  Theorem  1.  | 

Theorem  3  shows  that  ^t2  additional  multiplications  are  needed  to  update  0  after  a  constraint 
is  added  to  the  working  set. 

Deleting  a  constraint  from  the  working  set.  When  a  constraint  is  deleted  from  the  working  set, 
the  LY  factorisation  of  A  may  be  updated  as  described  in  Section  A.2.2  of  the  Appendix,  using 
a  sequence  of  plane  rotations  that  restore  the  lower-triangular  form  of  L.  The  basic  relationship 
between  Y  and  Y  in  this  case  is 

YP  =  ( t  2),  (3.5) 

where  the  orthonormal  matrix  P  is  the  product  of  suitably  chosen  plane  rotations.  We  assume 
that  the  Cholesky  factor  0  of  YH~xYt  is  available. 

In  this  situation,  the  vector  9  can  be  computed  with  only  a  small  amount  of  additional  work 
beyond  that  required  to  update  L,  Y  and  V,  as  indicated  in  the  following  theorem. 

Theorem  4.  Let  9  and  p  denote  the  solutions  of  (3.3)  and  (3.4)  at  the  point  x,  and  suppose  that 
a  constraint  is  deleted  from  the  working  set  at  the  point  f  =  z  -j-p.  Assume  that  the  updated 
matrix  Y  in  the  LY  factorisation  of  A,  the  updated  Cholesky  factor  V  of  YTH~lY  and  the 
related  vector  a  have  been  computed.  Then  the  solution  9  of  (3.4)  at  t  is  given  by 

=  +  j  =  l,2,...,t-l, 


with 


u  =  pT$  and  &€  =  a, 


where  P  is  defined  in  (3.5). 

Proof.  Multiplying  (3.4)  by  PT  and  inserting  9  =  Pu,  we  obtain 


PTYTH~xYPu 


YTH~lY 

zTH~lY 


YTH~l2  \u__f  YTH~l?\ 
gTH~lE  J  V  *TH~l9  ) 


(3.6) 


Since  YtH  xg  =  YTH  lg,  the  form  of  (3.6)  shows  that  Lemma  1  can  be  invoked  with  YTH~lY 
and  YtH~xY  taking  the  roles  of  M  and  M  respectively.  It  follows  that 


where  €  satisfies  the  single  triangular  system  Oo  =  0.  | 
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Discussion  of  the  better  conditioned  method.  The  numerical  advantage  of  Method  2  compared  to 
the  low-storage  method  results  from  avoiding  the  use  of  X*  in  computing  the  search  direction  (see 
(1.5)).  If  AkH~lA%  is  ill-conditioned,  the  computed  value  of  X*  may  contain  significant  errors, 
which  may  then  propagate  to  p*.  The  accuracy  of  6k  in  Method  2  depends  on  the  conditioning  of 
YkH~1Yk,  which  is  no  worse  than  that  of  H  itself.  A  disadvantage  of  Method  2  is  the  additional 
storage  required  for  the  n  X  t*  matrix  Yk  and  the  t^-dimensional  triangular  factor  of  YkH~1Yk. 
Furthermore,  the  updates  required  at  each  iteration  are  more  co'Aly  than  in  Method  1  or  3  (see 
Section  5). 

The  following  table  indicates  the  number  of  multiplications  required  to  perform  each  com¬ 
putation  during  an  iteration  of  Method  2. 


Update  B 

Compute  p 

Compute  X 

Add  constraint 

Delete  i-th  conetraint 

wm 

na+»t 

na+3nt+|ta 

3n(t-t)a+$(t_0a 

4.  Method  8:  the  weighted  Gram-Schmidt  method 

Following  Bartels,  Golub  and  Saunders  (1970),  we  note  that  the  range-space  equations  may  be 
solved  using  the  factorisations 


H  =  RtR  and  AkRTl-(Lk  0  )Ql  (4.1) 

where  R  is  the  n  X  n  Cholesky  factor  of  H,  Lk  is  a  t*  X  tk  lower-triangular  matrix,  and  Qk  is 
an  n  X  n  orthogonal  matrix.  Note  that  AkH~lAk  =  LkLk  and  therefore  the  transpose  of  Lk  is 
identical  to  the  triangular  matrix  17*  (2.1)  of  Method  1. 

The  factorisations  (4.1)  provide  a  solution  to  the  range-space  equations  (1.4)  and  (1.5)  in  the 
form: 

=  YlR-T„.  (4.2) 

R,„  =  -ZkZltrT,t,  (4.3) 

where  Yk  and  Zk  are  the  n  X  t*  and  n  X  (n  —  tk)  sections  of  the  matrix  Qk,  i.e. 

Qk  =  (Yh  Zk). 

A  variant  of  (4.1)-(4.3)  has  been  used  by  Goldfarb  and  Idnani  (1981),  who  recur  the  matrix 
QlR~T. 

We  now  propose  an  alternative  method  based  on  (4.1)-(4.3),  in  which  we  tain  advantage  of 
the  identity  ZkZ\  =  I  —  YkY%  in  order  to  avoid  storing  Zk.  In  place  of  the  full  orthogonal 
factorisation  in  (4.1),  we  utilise  the  weighted  Grun-Schmidt  factorisation 

AhR~x=LkYl ,  (4.4) 

which  can  be  updated  as  described  by  Daniel  et  a I.  (1976)  (see  Section  A.3  of  the  Appendix). 


4.  The  weighted  Gram-Scbmidt  method 


With  this  approach,  X*  could  be  recurred  using  the  results  of  Theorems  1  and  2  (this  follows 
directly  from  the  equivalence  of  17*  and  the  transpose  of  L*).  However,  we  shall  show  that  a  more 
efficient  method  may  be  obtained  by  recurring  three  vectors  u*,  v*  and  to*,  which  are  defined  by 
the  relations 

Rruk  =  gk,  vk  =  Ykuk  “d  wk  =  Ykvk  —  «*•  (4.5) 

(Note  that  Y\ tok  =  0.)  Substitution  into  (4.2)  and  (4.3)  allows  X*  and  p*  to  be  defined  from 

LkK  =  vk  (4.6) 

and 

RPk  =  »*•  (4.T) 

In  practice,  initial  vectors  uo,  t>o  and  too  are  defined  from  (4.5)  in  terms  of  an  initial  point 
xq  and  the  weighted  Gram-Schmidt  factors  of  a  corresponding  initial  working  set  Ao-  (The  point 
zQ  must  satisfy  AqX0  =  bQ.)  Thereafter,  the  vectors  u*,  v*  and  to*  can  be  updated  at  negligible 
cost,  as  we  show  below.  The  principal  computational  effort  per  iteration  lies  in  updating  the 
factorisation  (4.4)  as  the  working  set  changes,  and  in  computing  p*  (and  X*  when  needed)  from 
(4.6)  and  (4.7). 

As  in  the  last  two  sections,  we  shall  drop  the  subscript  k  and  use  barred  quantities  to  denote 
a  single  change  in  the  working  set. 

Adding  a  constraint  to  the  working  set.  When  a  constraint  is  added  to  the  working  set,  a  new 
row  (say,  aT)  becomes  the  last  row  of  A.  The  relationship  between  the  old  and  new  matrices  Y 
and  Y  of  (4.4)  is  given  by 

f  =  (Y  z)  (4.8) 

(see  Section  A.3.1  of  the  Appendix).  The  following  theorem  describes  how  the  quantities  u,  v  and 
p  may  be  updated  following  a  constraint  addition. 

Theorem  5.  Let  p  denote  the  vector  that  satisfies  the  range-space  equation  (1.5)  at  the  point  x. 
Let  u,  v  and  to  satisfy  (4.5)  at  the  point  x.  Let  X  (X  —  z-\-  ap)  be  a  point  at  which  the  row  aT  is 
added  to  A.  Assume  that  the  updated  factors  L  and  ¥  of  A  =  LYTR  have  been  computed,  and 
that  z,  the  new  last  column  of  Y ,  is  available.  The  vectors  u,  v  and  to  are  updated  as  follows: 


0) 

a  =  u  +  aw; 

(4.9) 

(H) 

€  =  ^  V  where  v  —  zTH] 

(Hi) 

®  =  (1  —  a)to  +  vz. 

Proof.  Using  (4.1),  (4.7)  and  the  quadratic  nature  of  the  objective  function,  we  have 


=  9  +  a  Hp 
=  Rtu  +  aRTRp 
=  Rtu  +  aRTv, 


and  (4.9)  follows  immediately. 


10 


Range-Space  Method*  for  Convex  Quadratic  Programming 


To  prove  (ii),  we  use  the  definition  of  0,  (4.S)  and  (4.9)  to  give 

>-(r-jr> 

Since  YTw  =  0  and  v~YTu,  this  proves  (ii).  Similarly, 

|  ®  =  — <J  =  (y  *)(  l)~  (•»  +  ••), 

which  reduces  to  (Hi)  after  substituting  Yv  —  u  =  w.  | 

Note  that  in  a  dual  QP  algorithm  that  retains  dual  feasibility,  the  steplength  a  =  1  will 
usually  be  taken  when  a  constraint  is  added  to  the  working  set;  cases  (i)  and  (iii)  then  simplify. 
If  further  constraints  are  added  at  the  same  point,  Theorem  5  remains  true  with  a  =  0,  p  =  0, 
and  w  =  0. 

Deleting  a  constraint  from  the  working  set.  When  a  constraint  is  deleted  from  the  working  set, 
the  relationship  between  the  old  and  new  matrices  Y  and  Y  of  (4.4)  is  given  by 

YP  =  (Y  2),  (4.10) 

where  the  orthonormal  matrix  P  is  the  product  of  suitably  chosen  plane  rotations  (see  Section 
A.3.2  of  the  Appendix).  The  following  theorem  indicates  how  the  quantities  u,  v  and  to  may  be 
updated  in  this  case. 

Theorem  6.  Let  u,  v  and  w  satisfy  (4.5)  at  the  point  z,  and  suppose  that  a  constraint  is  deleted 
from  the  working  set  at  the  point  2  =  z  +  ctp,  where  p  satisfies  (4.7).  Assume  that  the  updated 
factors  L  and  Y  of  A  =  LYTR  have  been  computed,  so  that  the  relationship  between  the  old 
and  new  orthogonal  factors  is  given  by  (4.10).  Then 


0) 

G  =  u  -f  aw; 

(4.11) 

00 

^  ^  ^  =  PTv,  where  v  =  2TQ; 

(4.12) 

(iii) 

«  =  (1  —  a)w  —  vf. 

(4.13) 

Proof.  Result  (i)  follows  as  in  Theorem  5.  To  prove  (ii),  note  that,  since  0  =  YT0,  we  may  write 

cm;> 

where  v  =  gTU.  Using  (4.10)  and  (4.11)  gives 

Since  YTw  =  0  and  YTu  =  v,  this  gives  the  desired  result. 


5.  Conclusions 
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Finally,  to  prove  (Hi),  we  use  the  definition  of  0  to  write  the  identity 

<D  =  ?e  —  a  =  {?  f  —  a  —  uz. 

Using  (4.10),  (4.11)  and  (4.12)  gives 

0  =  YPPtv  —  u  —  aw  —  vi. 

Since  P  is  orthonormal  and  Yv  —  u  =  w,  this  expression  simplifies  to  (4.13),  as  required.  | 

Note  that  a  primal  QP  algorithm  will  usually  delete  a  constraint  only  when  a  =  1,  in  which 
case  (i)  and  (iii)  simplify.  If  more  than  one  constraint  is  deleted  at  the  same  point,  the  theorem 
remains  true  with  a  =  0,  p  =  0  and  w  =  0. 

Discussion  of  the  weighted  Gram-Sehmidt  method.  The  mqor  storage  required  by  the  weighted 
Gram-Schmidt  (WGS)  method  beyond  that  for  the  low-storage  method  is  for  the  nXt  orthonormal 
matrix  1*.  The  WGS  method  has  a  storage  advantage  compared  to  Method  2,  since  it  requires 
storage  of  only  one  t-dimensional  triangular  matrix.  A  numerical  advantage  of  the  WGS  method 
(as  of  Method  2)  is  that  X*  is  not  used  explicitly  to  compute  p*.  However,  the  conditioning  of 
A*/?-1  may  be  worse  than  that  of  A*  itself,  and  hence  there  may  be  more  difficulties  with  the 
factorisation  (4.4)  than  with  (3.1). 

The  following  table  gives  the  number  of  multiplications  needed  to  perform  each  computation 
in  the  weighted  Gram-Schmidt  method. 


Compute  p 

Compote  X 

Add  constraint 

Dolote  <-th  constraint 

$n»+2nt 

Sn(t-0+S(t-t)» 

S.  Conclusions 

Table  1  summarises  the  major  work  (measured  by  multiplications)  associated  with  Methods  1,  2 
and  3.  In  any  active-set  method,  the  quantities  Ax  and  Ap  must  be  computed,  where  A  contains 
the  constraints  not  already  in  the  working  set.  In  practice,  the  vector  Ap  is  computed,  and 
Ax  is  updated  using  At  =  Ax  +  a  Ap.  The  work  needed  to  perform  this  operation  (n(m  —  t) 
multiplications)  is  not  included  in  the  table,  since  it  is  the  same  for  all  methods. 

The  table  gives  the  multiplication  counts  for  two  types  of  iteration  of  a  typical  feasible-point 
active-set  QP  method:  (i)  compute  p  and  add  a  constraint;  (ii)  compute  p,  compute  X  and  delete 
a  constraint.  As  indicated  by  the  summary  tables  for  each  method,  the  amount  of  work  required 
to  delete  a  constraint  depends  on  the  index  of  the  constraint  to  be  deleted.  The  figure  in  the 
table  corresponds  to  the  case  when  the  deleted  constraint  is  the  middle  row  of  the  working  set 
(i.e.,  has  index  $t).  The  “average”  figure  is  the  average  work  between  iterations  of  type  (i)  and 
(ii).  The  “average*  value  is  given  in  a  general  form,  and  for  three  specific  values  of  t  (t  =  0,  |n 
and  n),  in  order  to  indicate  the  relative  efficiencies  of  the  three  methods. 

Table  2  summarises  the  factorisations  used  in  each  of  the  three  methods,  and  the  associated 
storage  requirements.  The  entries  that  define  storage  give  the  number  of  floating-point  words 
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Table  1 

Summary  of  Work  Required  for  Range-Space  QP  Methods 


Cost  to 

Method  1 
(Low  Storage) 

Method  2 

(Better  Conditioned) 

Method  S 
(WGS) 

Compote  p, 
add  constraint 

2»a+2nt+ta 

2»a+4n«+ta 

»a+2nt 

Compute  p  and  X, 
delete  constraint 

»a+nt+f<a 

»a+|nt+yta 

§»a+f»t+fta 

Average 

Jn’+Jnt+Jt1 

Jna+Xnt+ll** 

|na+|»t+fta 

Average  (<  =  0) 

1.5na 

I.5na 

f"a 

Average  (( =  £n) 

2.5na 

3.5na 

1.8na 

Average  (t  =  ») 

4.1»a 

6.S»a 

S.lna 

TabU  2 

Summary  of  Storage  Required  for  Range-Space  QP  Methods 


Method  1 
(Low  Storage) 

Method  2 

(Better  Conditioned) 

Methods 

(WGS) 

Storage  for  A 

mn 

Same 

Same 

Storage  for  H  (J^K) 

K 

Same 

Same 

Additional  factorisations 

ArH~lA  =  UTU 

A  =  LYr 
YtH~1Y  =  UTV 

AR~l  =LYt 

Additioaal  storage 

t‘L.. 

References 


IS 


required  for  a  dense  problem.  The  entry  "Same"  means  that  the  storage  required  is  the  same  for 
all  methods. 

If  only  one  of  the  three  methods  were  to  be  chosen  for  general  use,  we  would  favor  the 
choice  of  the  weighted  Gram-Schmidt  method  (Method  3).  Its  storage  requirements  are  relatively 
modest,  and  it  tends  to  encounter  relatively  few  difficulties  with  numerical  stability.  Although  the 
amount  of  storage  required  is  more  than  that  for  Method  1,  in  our  view  the  superior  numerical 
properties  justify  the  additional  storage  cost.  Method  2  is  the  most  numerically  stable,  but 
requires  significant  additional  storage  and  work. 

The  methods  described  in  this  paper  should  be  effective  in  solving  QP  problems  with  few 
constraints  whenever  the  sise  or  structure  of  the  Hessian  is  such  that  its  Cholesky  factors  can 
be  computed  and  stored.  Thus,  these  methods  may  be  applied  directly  to  large  problems  in 
which  the  Hessian  is  suitably  sparse  and  structured  (for  example,  when  the  Hessian  is  diagonal 
or  banded). 

Sparsity  in  the  constraints  will  not  tenr"  to  be  helpful  in  Methods  1  and  3,  since  sparsity  in 
A  does  not  carry  over  to  the  factorisations  involving  A.  For  Method  2,  it  might  be  possible  to 
compute  a  compact  representation  of  Y  if  A  is  sparse. 

If  the  available  storage  is  sufficient  for  a  t  X  t  matrix  but  not  an  n  X  t  matrix,  Method  1  is  the 
only  possible  choice.  If  an  n  X  t  matrix  may  be  stored,  any  of  the  methods  may  be  used.  Unless 
numerical  stability  is  of  critical  importance,  Method  2  would  generally  not  be  considered.  The 
choice  between  Methods  1  and  3  depends  on  the  relative  ease  with  which  the  system  of  equations 
Hz  =  b  can  be  solved  given  a  factorisation  of  H.  Methods  1  and  3  are  similar  in  efficiency  if  this 
process  requires  between  nt  and  2 nt  operations;  Method  1  is  preferred  if  the  number  of  operations 
is  less  than  nt,  and  Method  3  if  the  number  is  greater  than  2nt. 
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Appendix  — *  Method*  for  modifying  matrix  factorization! 

In  this  Appendix  we  shall  outline  techniques  for  updating  the  matrix  factorisations  associated 
with  a  t  X  »  matrix  A  and  the  nxt  orthonormal  matrix  Y  that  forms  a  basis  for  the  range  space 
of  the  rows  of  A.  The  discussion  is  roughly  divided  into  two  parts  (although  much  of  the  basic 
theory  is  common  to  both  parts).  In  the  first  part,  we  consider  how  to  update  the  factorisations 

A  =  LYt  and  YTMY  =  UTU,  (AI) 

where  M  is  an  n  X  n  positive-definite  symmetric  matrix  and  U  is  a  t  X  t  upper-triangular  matrix, 
when  row  changes  are  made  to  the  matrix  A. 

In  the  second  part  we  are  concerned  with  updates  following  a  row  change  in  A  of  the 
factorisation 

A  =  LYtR, 

where  H  is  an  »  X  n  symmetric  positive-definite  matrix  with  Cholesky  factor  R  (so  that  H  = 
RtR),  and  L  is  a  t  x  t  lower-triangular  matrix. 

Some  of  the  procedures  described  here  are  simple  extensions  of  those  described  by  Gill  et  a/. 
(1974)  and  Daniel  et  a I.  (1976).  Some  methods  that  have  been  described  previously  are  repeated 
here  for  completeness. 

The  discussion  of  the  modifications  will  assume  a  general  familiarity  with  the  properties  of 
plane  rotations.  Sequences  of  plane  rotations  are  used  to  introduce  seros  into  appropriate  positions 
of  a  vector  or  matrix,  and  have  exceptional  properties  of  numerical  stability  (see,  e.g.,  Wilkinson, 
1965,  pp.  47-48).  Throughout  the  Appendix  we  shall  assume  the  three-multiplication  form  of  a 
plane  rotation;  see  Gill  et  a J.  (1974). 

We  shall  illustrate  each  modification  process  using  sequences  of  simple  diagrams,  following 
the  conventions  of  Cox  (1981)  to  show  the  effects  of  the  plane  rotations.  Each  diagram  depicts 
the  changes  resulting  from  one  plane  rotation.  The  following  symbols  are  used: 

x  denotes  a  non-iero  element  that  is  not  altered; 
m  denotes  a  non-sero  element  that  is  modified; 
f  denotes  a  previously  zero  element  that  is  filled  in; 

0  denotes  a  previously  non- zero  element  that  is  annihilated;  and 
(or  blank)  denotes  a  zero  element  that  is  unaltered. 

In  the  algebraic  representation  of  the  updates,  barred  quantities  will  represent  the  “new*  values. 


A.l.  A  basic  tod  —  updating  after  a  column  permutation 

One  of  the  basic  operations  in  updating  factorisations  is  the  need  to  restore  a  matrix  to  triangular 
form  after  one  of  its  rows  or  columns  is  moved  to  the  first  or  last  position.  We  shall  illustrate 
this  operation  in  the  case  where  it  is  required  to  move  an  arbitrary  column  of  an  upper-triangular 
matrix  R  to  the  last  position.  There  is  no  loss  of  generality  if  we  consider  moving  the  first  column 
to  the  last  position  (this  is  the  most  expensive  case  to  consider). 


A.1.1.  Method  It  using  a  column  interchange.  Suppose  that  II  is  a  permutation  matrix  that 
interchanges  the  first  and  last  columns  of  R.  We  shall  use  a  5  x  5  example  to  illustrate  the 
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column  interchange.  The  permuted  matrix  has  the  form 


c 


*  *  *  *  V 

XXX  I 


nn  = 


X 


X 


X 


r 

v  X 


X 


J 


We  transform  RTI  to  upper-triangular  form  in  two  stages.  First,  we  applj  a  sequence  of  plane 
rotations  on  the  left  so  that  the  “vertical  spike”  of  nonsero  sub-diagonal  elements  is  transformed 
into  a  “horizontal  spike*.  The  horisontal  spike  is  eliminated  with  another  sequence  of  plane 
rotations  applied  on  the  left. 

To  eliminate  the  vertical  spike,  a  backward  sweep  of  plane  rotations  is  applied  to  the  rows  of 
RII  in  the  planes  (n,  n  —  1),  (n,  n  —  2), . . . ,  (n,  2).  We  illustrate  the  effect  of  the  transformation 
on  the  5x5  example. 
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The  resulting  matrix  is  restored  to  upper-triangular  form  by  a  forward  sweep  of  (row)  rotations 
in  the  planes  (1,  n),  (2,  n), . . .,  (n  —  1,  n).  In  the  5  X  5  case  we  have 
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If  both  sweeps  of  plane  rotations  are  denoted  by  a  single  orthonormal  matrix  P,  we  have 

R  =  PRII.  (A2) 


A.1.2.  Method  2:  using  column  shifts.  There  are  other  permutation  matrices  i7  that  will  move 
the  first  column  of  R  to  the  last  position.  A  more  common  method  is  to  move  the  first  column 
to  the  last  position  and  shift  all  the  remaining  columns  one  position  to  the  left.  In  this  case,  the 
permuted  matrix  has  the  upper- Hessenberg  form 
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This  matrix  is  restored  to  upper-triangular  form  b y  a  forward  sweep  of  row  rotations  applied  on 
the  left  to  eliminate  the  subdiagonal  elements,  as  shown  in  the  following  diagram. 
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The  resulting  transformation  is  of  the  form  (A2)  except  that  P  comprises  a  single  sweep  of 
rotations  instead  of  a  double  sweep. 


A.I.S.  Discussion.  Clearly,  the  updating  techniques  of  the  last  two  sections  are  intimately  related. 
(For  example,  in  Section  A.1.2  the  transformation  of  the  upper-Hessenberg  matrix  to  triangular 
form  could  have  been  effected  by  first  applying  the  permutation  that  mores  the  first  row  to  the 
last  position  and  shifts  the  remaining  rows  up  by  one  position.  The  intermediate  matrix  is  then 
triangular,  with  a  "horisontal  spike”  in  the  last  row.  The  spike  can  be  eliminated  using  the 
method  of  Section  A.  1.1.)  We  hare  considered  these  transformations  separately  because  each  of 
the  two  methods  will  be  most  suitable  under  some  circumstances.  For  example,  the  rotations 
used  to  transform  R  may  also  be  applied  to  another  matrix  that  itself  must  be  maintained  in 
some  specific  form  (this  is  the  case  for  the  methods  described  in  Section  A.3).  In  another  case, 
the  triangular  matrix  may  need  to  be  stored  as  a  one-dimensional  array  for  maximum  efficiency 
in  some  high-leyel  computer  languages.  In  this  case,  Method  1  would  be  most  suitable  because 
no  new  elements  are  created  outside  the  triangular  structure  (except  for  the  spikes,  which  can  be 
held  in  a  work  yector). 

If  there  is  no  oyerriding  reason  to  use  one  method  or  the  other,  the  decision  may  be  influenced 
by  other  factors.  For  example,  although  Method  1  requires  two  sweeps  of  (row)  rotations,  some  of 
the  elements  in  the  vertical  and  horisontal  spikes  could  be  sero,  in  which  case  the  corresponding 
rotations  may  be  skipped.  In  contrast,  although  Method  2  appears  to  require  just  one  row  sweep, 
the  operations  involved  in  shifting  the  columns  one  position  to  the  left  should  be  regarded  as 
being  equivalent  to  a  sweep  of  "cheap”  column  rotations.  Also,  assuming  A  to  be  nonsingular, 
the  rotations  in  the  subsequent  row  sweep  can  never  be  skipped  (although  some  of  them  could  be 
simple  interchanges). 

In  the  last  two  sections  we  have  shown  that  the  restoration  of  a  triangular  matrix  to  triangular 
form  after  a  row  or  column  permutation  implicitly  or  explicitly  involves  the  application  of  plane 
rotations  to  one  or  more  of  the  following  three  types  of  matrix: 


(« 

"horisontal  spike" 

(v  R ) 

"vertical  spike” 

(A3) 

{R  v ) 

"upper-Hessenberg* 

(A4) 

The  structure  typified  by  these  three  standard  cases  will  occur  repeatedly  in  later  sections. 

A.S.  Updating  the  factors  of  A  and  YTMY 

A.2.1.  Adding  a  new  row  to  A.  When  a  new  row  is  added  to  A,  the  row  dimension  of  L  and  the 
column  dimension  of  Y  in  (Al)  both  increase  by  one.  Without  loss  of  generality  we  shall  assume 
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that  A  it  a  t  X  »  matrix,  and  that  the  new  row  is  added  in  the  last  position,  i.e., 


A  = 


The  computation  of  the  new  column  of  Y  is  equivalent  to  a  single  step  of  the  row-wise  computa¬ 
tion  of  the  orthogonal  factorisation  of  A  by  the  modified  Gram-Schmidt  algorithm.  The  new 
orthogonal  factor  Y  is  defined  as 

Y  =  (Y  z),  (A5) 


where  z  is  given  by  a  multiple  of  the  vector  a  orthogonalised  with  respect  to  the  orthonormal  set 
of  columns  of  Y,  i.e.  z  =  r(I  —  YYT)a,  where  r  is  a  normalising  factor.  For  more  details  of  the 
computation  of  z,  including  the  use  of  reorthogonalisation,  see  Daniel  et  a I.  (1976). 

Given  z,  the  new  row  of  L  is  found  from  the  identity 


=(.AM 


L  0 
lT  7 


)(P 


t  )' 


(A6) 


where  the  vector  /  and  scalar  7  are  to  be  determined.  The  last  row  of  this  identity  gives 


a  =  y/-f-7*. 


(A7) 


Premultiplication  of  this  equation  by  YT  gives  l  =  Yra  because  of  the  orthogonality  of  the 
columns  of  (Y  z).  Similarly,  premultiplication  of  (A7)  by  zT  gives  7  =  zTa. 

The  relationship  (A5)  implies  that 


?W  =  (YTtMY 

\  ztMY 


YtMz  \ 
ztMz  ) 


The  Cholesky  factor  U  of  YTMY  is  given  by 

0  =  ^  ^  U  where  UTu  =  YTMz  and  p  =  zTMz  —  uru 


(see  Wilkinson,  1965,  pp.  229-230).  If  YtMY  is  positive  definite,  zTMz  —  uTu  >  0. 

The  work  necessary  to  update  the  factors  L,  Y  and  U  when  adding  a  new  row  toatXn 
matrix  A  is  dominated  by  the  -j-  »»a  +  nt  multiplications  necessary  to  form  u  and  p,  and  the 
2(k  -f  l)nt  multiplications  necessary  to  form  z  and  the  new  row  of  L,  where  k  (k  >  0)  is  the 
number  of  reorthogonalisation  steps  required.  For  well  conditioned  problems,  reorthogonalisation 
is  usually  not  required. 


A.3.3.  Deleting  a  row  from  A.  Suppose  now  that  a  row  (say,  the  i-th)  is  deleted  from  at  X  n 
matrix  A,  so  that  the  row  dimension  of  A,  the  column  dimension  of  Y  and  the  dimension  of  L 
in  (Al)  are  decreased  by  one.  The  method  given  in  the  last  section  for  adding  a  row  to  a  matrix 
implies  that  if  the  last  row  of  A  is  deleted,  the  new  factors  are  identical  to  the  old  except  that 
the  last  column  of  Y  and  row  and  column  of  L  are  omitted.  Consequently,  in  the  general  case, 
the  only  nontrivial  work  necessary  to  perform  a  column  deletion  is  that  required  to  update  the 
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factors  after  the  deleted  column  is  moved  to  the  last  position.  We  shall  show  that  this  updating 
can  be  achieved  using  the  methods  of  Section  A.l. 

From  (Al),  we  have 

( £)=«''*  <"> 
where  S  it  at  X  t  matrix  that  is  identical  to  L  except  that  the  *-th  row  is  moved  to  the  last 
position  and  all  the  remaining  rows  have  been  shifted  upwards.  For  example,  in  the  case  t  =  6 
and  « =  3,  we  have 

/  **  \ 


5  = 


X  X  X  X 

X  X  X  X  X 


|  X  X  X  X  X  X 

V  x  x  x  / 


The  matrix  comprising  the  last  t — i  1  columns  of  S  is  the  transpose  of  a  matrix  of  the  form 
(A4)  and  it  maj  be  restored  to  lower-triangular  form  using  a  forward  sweep  of  column  rotations 
(say,  P)  as  in  Method  2  of  Section  A.1.2.  The  effect  of  these  rotations  is  to  give  the  factorisation 

(£)-W 

where  Lit  at  Xt  lower-triangular  matrix.  Let  L  and  the  product  YP  be  partitioned  so  that 

i  =  (fT  “d  rp==(r  *)•  (A9) 

where  L  is  a  (t  —  1)  X  (t  —  1)  lower-triangular  matrix,  T  is  an  n  x  (t  —  1)  orthogonal  matrix, 
and  i  is  an  n- vector.  A  comparison  of  (A9)  with  (A6)  shows  that  A  —  LfT  as  required. 

The  plane  rotations  applied  to  Y  also  transform  the  Cholesky  factor  U  of  YTMY.  The 
order  of  the  rotations  in  P  has  been  chosen  so  that  each  transformation  introduces  a  single  new 
subdiagonal  element  in  the  upper-triangular  matrix  U,  as  shown  in  the  following  sequence  of 
diagrams. 


X  X  X  X  X  X 

X  X  X  X  X 

X  X  X  X 

XXX 
X  X 

X 


x  x  flu  m  x  x 
x  IB  m  x  x 
m  m  x  x 
f  m  x  x 

X  X 
X 


x  x  x  a  in  x 
x  x  a  m  x 
x  a  a  x 
x  m  a  x 
fax 

X 


x  x  x  x  m  m 

xxx  m  id 
x  x  n  n 
x  x  m  m 
x  m  m 
f  m 


The  transformed  matrix  UP  is  now  in  upper-Hessenberg  form  similar  to  (A4).  As  described 
in  Section  A.1.2,  this  matrix  may  be  restored  to  upper-triangular  form  by  a  (partial)  sweep  of  row 
rotations  (say,  the  matrix  Q),  which  is  applied  on  the  /eft  to  eliminate  the  subdiagonal  elements. 
The  relationship  between  U  and  its  transformed  version  is 


where  0  is  the  upper-triangular  factor  of 

The  work  associated  with  deleting  a  row  from  A  comprises  |(f — a)*  multiplications  to  restore 
8,  3n(t  —  i)  multiplications  to  compute  YP,  and  3 (t  —  if  multiplications  for  the  two  sweeps  of 
rotations  applied  to  U. 
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A. 8.  Updating  the  factorisation  A  —  LYTR 

A.3.1.  Adding  a  row  to  A.  When  a  new  row  is  added  to  A,  the  row  dimension  of  L  and  the 
column  dimension  of  Y  both  increase  by  one.  Suppose  that  A  is  a  t  X  n  matrix.  If  the  row  is 
added  to  the  last  position,  the  updating  of  Y  and  L  is  identical  to  that  described  in  Section  A.2.1. 
In  this  case  the  new  column  x  is  given  by  a  multiple  of  the  vector  R~ T  a  orthogonalised  with 
respect  to  the  orthonormal  set  of  columns  of  Y,  i.e.  z  =  r(I  —  YYT)q,  where  q  satisfies  RTq  =  a 
and  r  is  a  normalizing  factor. 

The  number  of  multiplications  associated  with  adding  a  new  row  to  A  includes  the  following 
(where  only  the  highest-order  term  is  given):  Jn2  to  form  q]  and  2(Jfc  +  l)nt  to  form  *  and  the 
new  row  of  L,  where  k  is  the  number  of  reorthogonalisation  steps  required. 

A.3.1.  Deleting  a  row  from  A.  If  the  i-th  row,  say  aT,  is  deleted  from  a  tx»  matrix  A,  the 
column  dimension  of  Y  and  the  dimension  of  L  are  decreased  by  one.  (As  in  Section  A.3.1,  the 
dimension  of  R  is  unaltered.)  The  method  used  to  update  the  factors  is  identical  to  that  given 
in  Section  A.2.2,  except  that  it  is  not  necessary  to  select  P  so  that  the  structure  of  the  factor  U 
is  maintained.  In  particular,  the  factorisation 

(£)-"'* 

analogous  to  (A8)  may  be  obtained  by  interchanging  the  s-th  and  t-th  rows  of  A.  In  this  case, 
the  matrix  5  has  a  form  similar  to  the  transpose  of  (A3)  and  may  be  reduced  to  lower-triangular 
form  by  Method  2  of  Section  A.1.1. 

The  work  associated  with  deleting  the  i-th  row  from  A  comprises  3(t  —  s')2  multiplications  to 
triangularise  S  and  3n(i  —  s)  multiplications  to  compute  Y  P. 
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This  paper  describes  in  detail  three  alternative  range-space  methods,  each 
based  upon  updating  different  matrix  factorizations.  It  is  shown  that  the 
choice  of  method  depends  upon  the  relative  importance  of  storage  needs, 
computational  efficiency  and  numerical  reliability. 

The  updating  methods  described  are  applicable  to  both  primal  and  dual 
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